Dynamic Join Order Optimization for SPARQL Endpoint Federation

Authors

  • Hongyan Wu
  • Atsuko Yamaguchi
  • Jin-Dong Kim
Abstract

The web of linked data is inherently distributed across many data sources. A federated SPARQL query system, which queries RDF data via multiple SPARQL endpoints, is expected to process queries over these distributed sources. During a federated query, each data source may present a search space of nontrivial size. Finding a join order that minimizes the size of the intermediate results drawn from different sources is therefore key to optimizing the performance of such queries. In this study, we present a dynamic optimization approach to determining join order, which can find better join plans than static optimization approaches. Our experimental results show that the proposed approach consistently improves the performance of a federated query as the query becomes increasingly complex.
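
To make the contrast between static and dynamic join ordering concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm: the abstract does not describe the method in detail, so everything here (the in-memory `sources` dictionary standing in for endpoint results, the function names, and the greedy selection rule) is an assumption made for illustration. The static planner fixes the order up front from source sizes alone, while the dynamic planner picks each next source only after inspecting the current intermediate result.

```python
from itertools import product

def join(left, right):
    """Natural join of two lists of variable-binding dicts (toy nested loop)."""
    out = []
    for l, r in product(left, right):
        # Rows are compatible when they agree on every shared variable.
        if all(l[k] == r[k] for k in l.keys() & r.keys()):
            out.append({**l, **r})
    return out

def static_order(sources):
    """Hypothetical static planner: order fixed up front by ascending source size."""
    return sorted(sources, key=lambda name: len(sources[name]))

def dynamic_order(sources):
    """Hypothetical dynamic planner: choose each next source from the actual
    intermediate result. For simplicity it evaluates every candidate join,
    which a real engine would replace with a cheaper estimate."""
    remaining = dict(sources)
    first = min(remaining, key=lambda n: len(remaining[n]))
    order, current = [first], remaining.pop(first)
    while remaining:
        nxt = min(remaining, key=lambda n: len(join(current, remaining[n])))
        current = join(current, remaining.pop(nxt))
        order.append(nxt)
    return order, current

def intermediate_sizes(sources, order):
    """Sizes of the intermediate results produced by joining in the given order."""
    current = sources[order[0]]
    sizes = []
    for name in order[1:]:
        current = join(current, sources[name])
        sizes.append(len(current))
    return sizes

# Invented bindings standing in for three endpoints' answers to triple patterns.
sources = {
    "A": [{"x": 1}, {"x": 2}],
    "B": [{"x": 1, "y": "a"}, {"x": 1, "y": "b"}, {"x": 2, "y": "c"}],
    "C": [{"x": 1, "z": "p"}, {"x": 3, "z": "q"},
          {"x": 4, "z": "r"}, {"x": 5, "z": "s"}],
}

static_plan = static_order(sources)
dynamic_plan, result = dynamic_order(sources)
print(static_plan, intermediate_sizes(sources, static_plan))
print(dynamic_plan, intermediate_sizes(sources, dynamic_plan))
```

On this invented data both plans return the same final rows, but the dynamic plan joins the larger yet more selective source `C` before `B`, so its first intermediate result is smaller. Size alone, which is all the static planner sees here, does not reveal that selectivity.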

Similar resources

UPSP: Unique Predicate-based Source Selection for SPARQL Endpoint Federation

Efficient source selection is one of the most important optimization steps in federated SPARQL query processing as it leads to more efficient query execution plan generation. An over-estimation of the data sources will generate extra network traffic by retrieving irrelevant intermediate results. Such intermediate results will be excluded after performing joins between triple patterns. Consequen...

SPLENDID: SPARQL Endpoint Federation Exploiting VOID Descriptions

In order to leverage the full potential of the Semantic Web it is necessary to transparently query distributed RDF data sources in the same way as it has been possible with federated databases for ages. However, there are significant differences between the Web of (linked) Data and the traditional database approaches. Hence, it is not straightforward to adapt successful database techniques for ...

An Evaluation of SPARQL Federation Engines Over Multiple Endpoints

Due to the decentralized and linked architecture underlying Linked Data, running complex queries often requires collecting data from multiple RDF datasets. The optimization of the runtime of such queries, called federated queries, is of central importance to ensure the scalability of Semantic-Web and Linked-Data-driven applications. This has motivated a considerable body of work on SPARQL query fed...

Exploiting the query structure for efficient join ordering in SPARQL queries

The join ordering problem is a fundamental challenge that has to be solved by any query optimizer. Since the high-performance RDF systems are often implemented as triple stores (i.e., they represent RDF data as a single table with three attributes, at least conceptually), the query optimization strategies employed by such systems are often adopted from relational query optimization. In this pap...

Supporting Evacuation Missions with Ontology-Based SPARQL Federation

We study ontology-based SPARQL federation in support of coordinated action by deployed units in military operations. It is presumed that bandwidth is limited and unstable. Thus, we need an approach that generates few HTTP requests. Existing techniques employ join-order heuristics that may cause requests to multiply as a factor of the number of joins in a query. This can easily lead to an amount...

Publication date: 2015